Introduction of Fingerprint Vector based Bayesian Method for Spam Filtering

نویسندگان

  • Bin Chen
  • Shoubin Dong
  • Weidong Fang
چکیده

With the development of the diversification of spam, it raises the difficulties and challenges to content-based spam filtering. To address this problem, this paper firstly introduced the statistical features of Email headers, and then proposed a method to use these features to improve Bayesian anti-spam filter. The selected Email-header features are presented as the fingerprint vectors, and then transformed to the input tokens to Bayesian filter. It shows that this method efficiently utilizes the messages embedded in Email headers and then improves the performance of the Bayesian anti-spam filtering.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Detecting Image Spam Using Image Texture Features

Filtering image email spam is considered to be a challenging problem because spammers keep modifying the images being used in their campaigns by employing different obfuscation techniques. Therefore, preventing text recognition using Optical Character Recognition (OCR) tools and imposing additional challenges in filtering such type of spam. In this paper, we propose an image spam filtering tech...

متن کامل

Email Shape Analysis

Email has become an integral part of everyday life. Without a second thought we receive bills, bank statements, and sales promotions all to our inbox. Each email has hidden features that can be extracted. In this paper, we present a new mechanism to characterize an email without using content or context called Email Shape Analysis. We explore the applications of the email shape by carrying out ...

متن کامل

The Open University ’ s repository of research publications and other research outputs Email shape analysis

Email has become an integral part of everyday life. Without a second thought we receive bills, bank statements, and sales promotions all to our inbox. Each email has hidden features that can be extracted. In this paper, we present a new mechanism to characterize an email without using content or context called Email Shape Analysis. We explore the applications of the email shape by carrying out ...

متن کامل

Survey of Spam Filtering Techniques and Tools, and MapReduce with SVM

Abstract Spam is unsolicited, junk email with variety of shapes and forms. To filter spam, various techniques are used. Techniques like Naïve Bayesian Classifier, Support Vector Machine (SVM) etc. are often used. Also, a number of tools for spam filtering either paid or free are available. Amongst all techniques SVM is mostly used. SVM is computationally intensive and for training it can’t work...

متن کامل

Support Vector Machines Parameter Selection Based on Combined Taguchi Method and Staelin Method for E-mail Spam Filtering

Support vector machines (SVM) are a powerful tool for building good spam filtering models. However, the performance of the model depends on parameter selection. Parameter selection of SVM will affect classification performance seriously during training process. In this study, we use combined Taguchi method and Staelin method to optimize the SVM-based E-mail Spam Filtering model and promote spam...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008